A User Study for Evaluation of Formal Verification Results and their Explanation at Bosch
Context: Ensuring safety for any sophisticated system is getting more complex
due to the rising number of features and functionalities. This calls for formal
methods to establish confidence in such systems. Nevertheless, using formal
methods in industry is demanding because of their lack of usability and the
difficulty of understanding verification results. Objective: We evaluate the
acceptance of formal methods by Bosch automotive engineers, particularly
whether the difficulty of understanding verification results can be reduced.
Method: We perform two different exploratory studies. First, we conduct a user
survey to explore challenges in identifying inconsistent specifications and
using formal methods by Bosch automotive engineers. Second, we perform a
one-group pretest-posttest experiment to collect impressions from Bosch
engineers familiar with formal methods to evaluate whether understanding
verification results is simplified by our counterexample explanation approach.
Results: The results from the user survey indicate that identifying refinement
inconsistencies, understanding formal notations, and interpreting verification
results are challenging. Nevertheless, engineers are still interested in using
formal methods in real-world development processes because it could reduce the
manual effort for verification. Additionally, they also believe formal methods
could make the system safer. Furthermore, the one-group pretest-posttest
experiment results indicate that engineers are more comfortable understanding
the counterexample explanation than the raw model checker output. Limitations:
The main limitation of this study is the generalizability beyond the target
group of Bosch automotive engineers.
Comment: This manuscript is under review with the Empirical Software
Engineering journal
Runtime Verification of Self-Adaptive Systems with Changing Requirements
To accurately make adaptation decisions, a self-adaptive system needs precise
means to analyze itself at runtime. To this end, runtime verification can be
used in the feedback loop to check that the managed system satisfies its
requirements formalized as temporal-logic properties. These requirements,
however, may change due to system evolution or uncertainty in the environment,
managed system, and requirements themselves. Thus, the properties under
investigation by the runtime verification have to be dynamically adapted to
represent the changing requirements while preserving the knowledge about
requirements satisfaction gathered thus far, all with minimal latency. To
address this need, we present a runtime verification approach for self-adaptive
systems with changing requirements. Our approach uses property specification
patterns to automatically obtain automata with precise semantics that are the
basis for runtime verification. The automata can be safely adapted during
runtime verification while preserving intermediate verification results to
seamlessly reflect requirement changes and enable continuous verification. We
evaluate our approach on an Arduino prototype of the Body Sensor Network and
the Timescales benchmark. Results show that our approach is over five times
faster than the typical approach of redeploying and restarting runtime monitors
to reflect requirements changes, while improving the system's trustworthiness
by avoiding interruptions of verification.
Comment: 18th Symposium on Software Engineering for Adaptive and Self-Managing
Systems (SEAMS 2023)
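The idea of adapting a monitor in place, rather than redeploying it, can be
illustrated with a minimal sketch. This is not the authors' implementation:
the class ResponseMonitor, the event names "alarm" and "ack", and the
event-count bound are all hypothetical, and a single bounded-response pattern
stands in for the pattern catalog the abstract refers to.

```python
from dataclasses import dataclass, field


@dataclass
class ResponseMonitor:
    """Hypothetical monitor for one bounded-response pattern:
    every `trigger` must be followed by a `response` within `bound` events."""
    trigger: str
    response: str
    bound: int
    pending: list = field(default_factory=list)  # ages of open obligations
    violations: int = 0

    def step(self, event: str) -> None:
        # Age every open obligation; one that exceeds the bound is a violation.
        aged = [age + 1 for age in self.pending]
        self.violations += sum(1 for age in aged if age > self.bound)
        self.pending = [age for age in aged if age <= self.bound]
        if event == self.response:
            self.pending.clear()  # a response discharges all open obligations
        if event == self.trigger:
            self.pending.append(0)

    def adapt(self, new_bound: int) -> None:
        # Requirement change at runtime: only the bound is replaced, so open
        # obligations and the violation count gathered so far are preserved.
        self.bound = new_bound


def run(events, bound, adapt_at=None, new_bound=None):
    """Feed `events` to a monitor, optionally adapting it before index `adapt_at`."""
    monitor = ResponseMonitor("alarm", "ack", bound)
    for index, event in enumerate(events):
        if index == adapt_at:
            monitor.adapt(new_bound)
        monitor.step(event)
    return monitor.violations
```

Calling run(["alarm", "x", "x", "ack"], bound=2, adapt_at=1, new_bound=3)
relaxes the requirement while an obligation is still open. A restarted monitor
would lose that open obligation, which is the loss of intermediate results the
abstract describes.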
A Property Specification Pattern Catalog for Real-Time System Verification with UPPAAL
Context: The goal of specification pattern catalogs for real-time
requirements is to mask the complexity of specifying such requirements in a
timed temporal logic for verification. For this purpose, they provide frontends
to express and translate pattern-based natural language requirements to
formulae in a suitable logic. However, the widely used real-time model checking
tool UPPAAL only supports a restricted subset of those formulae that focus only
on basic and non-nested reachability, safety, and liveness properties. This
restriction renders many specification patterns inapplicable. As a workaround,
timed observer automata need to be constructed manually to express
sophisticated requirements envisioned by these patterns. Objective: In this
work, we fill these gaps by providing a comprehensive specification pattern
catalog for UPPAAL. The catalog supports qualitative and real-time requirements
and covers all corresponding patterns of existing catalogs. Method: The catalog
we propose is integrated with UPPAAL. It supports the specification of
qualitative and real-time requirements using patterns and provides an automated
generator that translates these requirements to observer automata and TCTL
formulae. The resulting artifacts are used for verifying systems in UPPAAL.
Thus, our catalog enables an automated end-to-end verification process for
UPPAAL based on property specification patterns and observer automata. Results:
We evaluate our catalog on three UPPAAL system models reported in the
literature and mostly applied in an industrial setting. As a result, not only
the reproducibility of the related UPPAAL models was possible, but also the
validation of an automated, seamless, and accurate pattern- and observer-based
verification process. Conclusion: The proposed property specification pattern
catalog for UPPAAL enables practitioners to specify qualitative and real-time
requirements...
Comment: Accepted Manuscript
Tests4Py: A Benchmark for System Testing
Benchmarks are among the main drivers of progress in software engineering
research, especially in software testing and debugging. However, current
benchmarks in this field could be better suited for specific research tasks, as
they rely on weak system oracles like crash detection, come with few unit tests
only, need more elaborative research, or cannot verify the outcome of system
tests.
Our Tests4Py benchmark addresses these issues. It is derived from the popular
BugsInPy benchmark, including 30 bugs from 5 real-world Python applications.
Each subject in Tests4Py comes with an oracle to verify the functional
correctness of system inputs. Besides, it enables the generation of system
tests and unit tests, allowing for qualitative studies by investigating
essential aspects of test sets and extensive evaluations. These opportunities
make Tests4Py a next-generation benchmark for research in test generation,
debugging, and automatic program repair.
Comment: 5 pages, 4 figures
Grammar-based fuzzing of data integration parsers in computational materials science
Context
Computational materials science (CMS) focuses on in silico experiments to compute the properties of known and novel materials, where many software packages are used in the community. The NOMAD Laboratory (Draxl C, Scheffler) offers to store the input and output files in its FAIR data repository. Since the file formats of these software packages are non-standardized, parsers are used to provide the results in a normalized format.
Objective
The main goal of this article is to report experience and findings of using grammar-based fuzzing on these parsers.
Method
We have constructed an input grammar for four common software packages in the CMS domain and performed an experimental evaluation on the capabilities of grammar-based fuzzing to detect failures in the Novel Materials Discovery (NOMAD) parsers.
Results
With our approach, we were able to identify three unique critical bugs concerning service availability, as well as several additional syntactic, semantic, logical, and downstream bugs in the investigated NOMAD parsers. We reported all issues to the developer team prior to publication.
Conclusion
Based on the experience gained, we can recommend grammar-based fuzzing also for other research software packages to improve the trust level in the correctness of the produced results.
Peer Reviewed
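As a rough illustration of grammar-based fuzzing in general, the sketch below expands a small context-free grammar at random and feeds the generated inputs to a parser under test. Everything in it is an assumption made for illustration: the grammar, the key names, and the toy parse function are hypothetical and unrelated to the grammars or NOMAD parsers evaluated in the article.

```python
import random

# Hypothetical toy grammar for a "key = value" file format.
GRAMMAR = {
    "<file>": [["<line>"], ["<line>", "\n", "<file>"]],
    "<line>": [["<key>", " = ", "<value>"]],
    "<key>": [["energy"], ["volume"], ["natoms"]],
    "<value>": [["<int>"], ["-", "<int>"], ["<int>", ".", "<int>"]],
    "<int>": [["<digit>"], ["<digit>", "<int>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Randomly expand `symbol` into a string derived from GRAMMAR."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative to force termination.
        alternatives = [min(alternatives, key=len)]
    expansion = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in expansion)


def parse(text):
    """Toy parser under test; it wrongly assumes `natoms` is always an integer literal."""
    result = {}
    for line in text.split("\n"):
        key, value = line.split(" = ")
        result[key] = int(value) if key == "natoms" else float(value)
    return result


def fuzz(trials=200, seed=0):
    """Run the parser on generated inputs and collect the inputs that raise."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        text = generate("<file>", rng)
        try:
            parse(text)
        except Exception as exc:  # any uncaught exception counts as a failure
            failures.append((text, type(exc).__name__))
    return failures
```

Because the grammar permits decimal values for every key, some generated files contain a line such as a decimal natoms value, which the toy parser rejects with an uncaught exception. Exposing that kind of mismatch between accepted syntax and parser assumptions is what this style of fuzzing is meant to do.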
Formal Synthesis of Uncertainty Reduction Controllers
In its quest for approaches to taming uncertainty in self-adaptive systems
(SAS), the research community has largely focused on solutions that adapt the
SAS architecture or behaviour in response to uncertainty. By comparison,
solutions that reduce the uncertainty affecting SAS (other than through the
blanket monitoring of their components and environment) remain underexplored.
Our paper proposes a more nuanced, adaptive approach to SAS uncertainty
reduction. To that end, we introduce a SAS architecture comprising an
uncertainty reduction controller that drives the adaptive acquisition of new
information within the SAS adaptation loop, and a tool-supported method that
uses probabilistic model checking to synthesise such controllers. The
controllers generated by our method deliver optimal trade-offs between SAS
uncertainty reduction benefits and new information acquisition costs. We
illustrate the use and evaluate the effectiveness of our approach for mobile
robot navigation and server infrastructure management SAS.
QuantUM: Quantitative Safety Analysis of UML Models
When developing a safety-critical system it is essential to obtain an
assessment of different design alternatives. In particular, an early safety
assessment of the architectural design of a system is desirable. In spite of
the plethora of available formal quantitative analysis methods it is still
difficult for software and system architects to integrate these techniques into
their everyday work. This is mainly due to the lack of methods that can be
directly applied to architecture level models, for instance given as UML
diagrams. Also, it is necessary that the description methods used do not
require a profound knowledge of formal methods. Our approach bridges this gap
and improves the integration of quantitative safety analysis methods into the
development process. All inputs of the analysis are specified at the level of a
UML model. This model is then automatically translated into the analysis model,
and the results of the analysis are consequently represented on the level of
the UML model. Thus the analysis model and the formal methods used during the
analysis are hidden from the user. We illustrate the usefulness of our approach
using an industrial-strength case study.
Comment: In Proceedings QAPL 2011, arXiv:1107.074